4D Analytics

Data Collector

Last updated: July 10, 2020

Widget size: 2 x 1, 2 x 2

Use the Data Collector widget to add data to the Amulet database. It can collect data in the following ways:

  • Manual user upload of file
  • Automatic collection via email attachments
  • Automatic collection from a specified FTP site
  • Automatic collection from a specified Folder

Data Collector classes are written for specific customer requirements, using a library of helper functions for extracting data from imported files and inserting values into Amulet.

When a Data Collector widget is created, it is associated with a specific Data Collector class; every file received by the widget is processed by this class. The expected frequency of uploads can be specified, and users can be emailed the status of each import or notified if an upload is not received within the expected time frame. Files are generally Excel or CSV, but can be any type of data.
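As a rough illustration of what a Data Collector class does with a received file, the sketch below parses a CSV upload into (tag, timetag, value) rows and collects per-line errors. It is written in Python for brevity and is not the product's helper library: the function name parse_csv_upload, the column names timetag, tag, and value, and the date format are all assumptions made for this example.

```python
import csv
import io
from datetime import datetime


def parse_csv_upload(text):
    """Parse CSV text into (tag, timetag, value) rows plus a list of error messages.

    Assumed (hypothetical) layout: a header row with columns timetag, tag, value.
    """
    rows = []
    errors = []
    reader = csv.DictReader(io.StringIO(text))
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        try:
            tag = row["tag"]
            timetag = datetime.strptime(row["timetag"], "%Y-%m-%d %H:%M")
            value = float(row["value"])
        except (KeyError, TypeError, ValueError) as exc:
            # A bad line is reported but does not stop the rest of the import.
            errors.append(f"line {line_no}: {exc}")
            continue
        rows.append((tag, timetag, value))
    return rows, errors
```

Returning the errors alongside the parsed rows mirrors the way the widget reports import errors per file rather than aborting on the first bad line.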

Upload File

Manually upload a file to be queued for processing. Once the file has been uploaded, it will be displayed in the queue. As the file is processed, the progress bar and Progress Details provide real-time feedback.

Click the Details button to display errors generated during the Import process. You can download the file by clicking on the file link and saving the file. The number of files displayed on this screen is limited. To see every file processed, use the Old Uploads page.

Individual tasks can be deleted from the Upload File and Old Uploads lists by clicking the Delete button in the Delete All row. Deleted tasks are removed from both lists.

Clicking the Delete All button clears all tasks from both the Upload File and Old Uploads lists.

Old Uploads

Every file imported is available on this page.

Clicking Details brings up a pop-up containing details of the reason for failure.

The Logs button is an advanced feature available only to SuperUsers of the system. Clicking it provides access to the logs generated by the Data Collector background processes.

Background Process

A background process does the following:

  • Feeds received files through the correct Data Collector class.
  • Reports progress and errors.
  • Emails users the results of file imports.
  • Emails users about any Missed Uploads.
  • Uncompresses received zip files and queues their contents for processing.
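The zip-handling step above could be sketched as follows. This is illustrative Python, not the product's code; the function name enqueue_upload and the (filename, bytes) queue entries are assumptions. Archives are recognised by the .zip extension rather than by content, because Excel .xlsx files are themselves zip containers and should not be unpacked.

```python
import io
import zipfile
from collections import deque


def enqueue_upload(queue, filename, data):
    """Queue a received file; a .zip upload is expanded and its members are queued instead."""
    if filename.lower().endswith(".zip"):
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue  # skip directory entries, queue only real files
                queue.append((info.filename, archive.read(info)))
    else:
        queue.append((filename, data))
```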

Automatic Email Collection

Files can be automatically imported by collecting emails. Every email attachment received is queued for processing by the Data Collector class.

The system keeps track of files already processed; no files are removed from the server.

  • Frequency – How often the email server is checked for new emails.
  • Server – Hostname or IP address of the POP3 email server.
  • Username – Username to log into the email server.
  • Password – Password to log into the email server.
  • Use Secure – Uses a secure (SSL) connection to the email server. Use this only if the email server supports a secure connection (for example, Gmail).
  • Force Check Now – Forces a check of the email server immediately.
  • Reset read emails – The system keeps track of which emails have been processed. This button makes the system forget all the emails it has processed, so it will re-import all available files.
  • Reset Stats – Resets the collection statistics.
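A minimal sketch of the attachment handling and "already processed" tracking described above, using Python's standard email package. Retrieval from the POP3 server is left out; the sketch assumes each message arrives as a (unique id, raw bytes) pair, and the function names are hypothetical rather than part of the product.

```python
from email import message_from_bytes
from email.message import EmailMessage
from email.policy import default


def extract_attachments(raw_message):
    """Return (filename, content bytes) for every attachment in a raw email."""
    msg = message_from_bytes(raw_message, policy=default)
    return [
        (part.get_filename(), part.get_payload(decode=True))
        for part in msg.iter_attachments()
    ]


def collect_new(messages, seen_ids):
    """Queue attachments from messages whose id is not yet in seen_ids.

    messages: iterable of (unique_id, raw_bytes) pairs (assumed shape).
    Nothing is deleted from the server; seen_ids is the only state that changes.
    """
    queued = []
    for uid, raw in messages:
        if uid in seen_ids:
            continue
        queued.extend(extract_attachments(raw))
        seen_ids.add(uid)
    return queued
```

In this model, Reset read emails corresponds to clearing seen_ids, after which every message still on the server is collected again.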

Automatic FTP Collection

Import files automatically from an FTP server. Every file collected is queued for processing by the Data Collector.

  • Frequency – How often the FTP server is checked for new files.
  • Server – Hostname or IP address of the FTP server.
  • Username – Username to log into the FTP server.
  • Password – Password to log into the FTP server.
  • Collection Mode – When collecting files, there are a number of approaches to avoid repeatedly collecting the same file. The system keeps an internal record of files it has processed.
    • ARCHIVE_PROCESS_SEEN_FILES – All files in the directory will be processed and moved to the archive folder.
    • ARCHIVE_IGNORE_SEEN_FILES – Files that have not been processed before will be processed and moved to the archive folder.
    • NO_ARCHIVING_IGNORE_SEEN_FILES – Files that have not been processed before will be processed but not moved to the archive folder.
    • DELETE_PROCESS_SEEN_FILES – All files in the directory will be processed and then deleted.
  • FTP Directory – A sub-folder can be scanned for files if required.
  • Archive Folder – If files are being archived, they will be moved into this folder underneath the FTP Directory. If this folder does not exist, it will be created.
  • FTP Timeout (secs) – Timeout used when communicating with the FTP server.
  • Force Check Now – Forces a check of the FTP server immediately.
  • Reset read files – The system keeps track of files that have been processed. This button makes the system forget which files it has processed and it will re-import all available files from within the FTP Directory.
  • Reset Stats – Reset the collection statistics.
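The four collection modes differ in two respects: whether previously seen files are skipped, and what happens to a file after processing. The sketch below encodes that reading of the descriptions above as a small lookup table. It is an illustration only; the enum, the plan_collection function, and the 'archive' / 'keep' / 'delete' action labels are assumptions, not the product's implementation.

```python
from enum import Enum


class CollectionMode(Enum):
    ARCHIVE_PROCESS_SEEN_FILES = "ARCHIVE_PROCESS_SEEN_FILES"
    ARCHIVE_IGNORE_SEEN_FILES = "ARCHIVE_IGNORE_SEEN_FILES"
    NO_ARCHIVING_IGNORE_SEEN_FILES = "NO_ARCHIVING_IGNORE_SEEN_FILES"
    DELETE_PROCESS_SEEN_FILES = "DELETE_PROCESS_SEEN_FILES"


# mode -> (skip files already seen?, what to do with a file after processing)
_RULES = {
    CollectionMode.ARCHIVE_PROCESS_SEEN_FILES: (False, "archive"),
    CollectionMode.ARCHIVE_IGNORE_SEEN_FILES: (True, "archive"),
    CollectionMode.NO_ARCHIVING_IGNORE_SEEN_FILES: (True, "keep"),
    CollectionMode.DELETE_PROCESS_SEEN_FILES: (False, "delete"),
}


def plan_collection(mode, listing, seen):
    """Return [(filename, after_action)] for the files to process in this check.

    listing: filenames currently in the directory.
    seen: set of filenames already processed; updated in place.
    """
    skip_seen, after_action = _RULES[mode]
    plan = []
    for name in listing:
        if skip_seen and name in seen:
            continue
        plan.append((name, after_action))
        seen.add(name)
    return plan
```

Under this model, Reset read files amounts to emptying the seen set, so the next check treats every file in the directory as new.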

Automatic Folder Collection

Files can be automatically imported from a designated folder. Every file collected is queued for processing by the Data Collector class.

  • Frequency – How often the folder is checked for new files.
  • Folder – Folder from which files for import are collected. This can be any folder that is available on the webserver (e.g., C:\FileImporter) or via a remote file share (e.g., \\192.168.1.33\filestore).
  • Collection Mode – When collecting files, there are a number of approaches to avoid repeatedly collecting the same file. The system keeps an internal record of files it has processed.
    • ARCHIVE_PROCESS_SEEN_FILES – All files in the directory will be processed and moved to the archive folder.
    • ARCHIVE_IGNORE_SEEN_FILES – Files that have not been processed before will be processed and moved to the archive folder.
    • NO_ARCHIVING_IGNORE_SEEN_FILES – Files that have not been processed before will be processed but not moved to the archive folder.
    • DELETE_PROCESS_SEEN_FILES – All files in the directory will be processed and then deleted.
  • Archive Folder – If files are being archived, they will be moved into this folder underneath the configured Folder. If this folder does not exist, it will be created. The Archive Folder should just be the folder name, within the Folder, not the full path.
  • Force Check Now – Forces a check of the folder immediately
  • Reset read files – The system keeps track of files that have been processed. This button makes the system forget which files it has processed and re-import all available files.
  • Reset Stats – Reset the collection statistics
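The Archive Folder rule (a bare folder name inside the configured Folder, created if missing) can be sketched as below. This is illustrative Python with a hypothetical archive_file function; it is not the product's code. PureWindowsPath is used for the name check so that both / and \ are treated as path separators, matching the Windows-style paths in the examples above.

```python
import shutil
from pathlib import Path, PureWindowsPath


def archive_file(folder, archive_name, filename):
    """Move folder/filename into folder/archive_name, creating the archive folder if needed."""
    # The Archive Folder setting must be a plain folder name, not a full path.
    if not archive_name or PureWindowsPath(archive_name).name != archive_name:
        raise ValueError("Archive Folder must be a folder name, not a path")
    folder = Path(folder)
    archive_dir = folder / archive_name
    archive_dir.mkdir(exist_ok=True)
    target = archive_dir / filename
    shutil.move(str(folder / filename), str(target))
    return target
```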

Notifications

The system can notify (email) people when certain events happen.

  • Overdue Frequency – When uploads are regular, you can set the frequency at which you expect them. If an upload is not received within the time frame, the system emails the configured users to notify them of the Missing Upload. The options are: DISABLED, DAILY, WEEKLY, and MONTHLY.
  • Overdue Grace Period – If required, a grace period can be specified if the uploads are not always exactly within the period. The grace period is measured in intervals of the selected frequency.
  • Email on Success – Semi-colon separated email addresses to email when a file import is successful.
  • Email on Failure – Semi-colon separated email addresses to email when a file import failed.
  • Email on Missing – Semi-colon separated email addresses to email when there has been a Missed Upload.
  • Email on All – Semi-colon separated email addresses to email when any event occurs.
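One possible reading of the overdue rule is sketched below: an upload is overdue once more than one interval, plus the grace period expressed in further intervals, has passed since the last upload. The exact deadline arithmetic, the 30-day approximation for MONTHLY, and the function names are assumptions made for illustration; the product may calculate this differently.

```python
from datetime import datetime, timedelta

# Assumption: MONTHLY is approximated as 30 days for this sketch.
INTERVALS = {
    "DAILY": timedelta(days=1),
    "WEEKLY": timedelta(weeks=1),
    "MONTHLY": timedelta(days=30),
}


def is_overdue(last_upload, now, frequency, grace_periods=0):
    """Return True if no upload has arrived within the expected window plus grace."""
    if frequency == "DISABLED":
        return False
    # The grace period is counted in intervals of the selected frequency.
    deadline = last_upload + INTERVALS[frequency] * (1 + grace_periods)
    return now > deadline


def parse_recipients(field):
    """Split a semi-colon separated address field, ignoring blanks and whitespace."""
    return [address.strip() for address in field.split(";") if address.strip()]
```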

Settings

Each Data Collector may have its own specific settings that can be manipulated to configure the data collector. These differ for each Data Collector. Please refer to the specific data collector’s documentation for more details.

Below is an example of a settings screen for a Data Collector. These will be different for each Data Collector depending on what configuration settings that Data Collector requires.

There are a few settings that are common to most data collectors:

Name | Default Value | Comments
Allow standarddata updates | True | Overwrite existing values if they already exist for a particular timetag. Setting to False would import new values only.
Rollback dependent calcs | True | Rollback (remove) calculations' values that depend on changed values.
Max old uploads | 1000 | The maximum number of uploads and files to be retained.
Max allowed task run time | 3600 | The maximum duration (in seconds) that a task will run, before it is terminated.
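The effect of the Allow standarddata updates setting can be pictured with the sketch below, which merges imported values into existing ones keyed by timetag. It is a simplified model, not the product's storage logic; the function name and the dictionary representation are assumptions. The returned list of changed timetags is the kind of information a dependent-calculation rollback would need.

```python
def merge_values(existing, incoming, allow_updates=True):
    """Merge incoming {timetag: value} into existing {timetag: value}.

    Returns (merged, changed), where changed lists the timetags whose
    existing value was overwritten.
    """
    merged = dict(existing)
    changed = []
    for timetag, value in incoming.items():
        if timetag in merged:
            # Existing values are only overwritten when updates are allowed.
            if allow_updates and merged[timetag] != value:
                merged[timetag] = value
                changed.append(timetag)
        else:
            merged[timetag] = value  # new values are always imported
    return merged, changed
```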

Schedule

Tasks can be performed on a schedule from within the Bigguy framework. If the task is an Email collection, FTP collection, or Folder collection, you do not need to set the schedule as the collection frequency is already set up on that specific tab.

Note: For additional information, see Data Collection Framework.

This schedule is for all other Data Collectors that have their own method of connecting to the data source, such as a direct database connection.

  • Base Time – This sets the time at which the task will run. When you set the start time of the first run, all subsequent runs will be offset from this time, so if this time is midnight and the interval is once a day, the task will always run at midnight.
  • Interval Type – The interval at which this task will recur.
  • Interval Count – The number of intervals between each run. For example, to run the task every 2 hours, set the Interval Type to Minutes and the Interval Count to 120. Setting this parameter to 0 disables the task.
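The relationship between Base Time, Interval Type, and Interval Count can be sketched as follows. This is an illustration under assumptions: only the Minutes interval type is named in this document, so the Hours and Days entries, like the next_run function itself, are hypothetical.

```python
from datetime import datetime, timedelta

# Only "Minutes" appears in the documentation; "Hours" and "Days" are assumed here.
UNITS = {
    "Minutes": timedelta(minutes=1),
    "Hours": timedelta(hours=1),
    "Days": timedelta(days=1),
}


def next_run(base_time, interval_type, interval_count, now):
    """Return the next run time after now, or None if the task is disabled."""
    if interval_count == 0:
        return None  # an Interval Count of 0 disables the task
    step = UNITS[interval_type] * interval_count
    if now < base_time:
        return base_time
    # Runs are always offset from the base time by a whole number of steps.
    elapsed_steps = (now - base_time) // step
    return base_time + step * (elapsed_steps + 1)
```

With a base time of midnight and an interval of 120 minutes, a check at 03:30 yields 04:00 as the next run, because every run falls on a whole multiple of the interval after the base time.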

Logs

The logs tab is only available to a SuperUser.

There are two ways to access the logs.

  • The logs tab (opens in a new browser tab) displays all log entries from the Data Collector widget.
  • You can also view the logs from a specific run. On the Old Uploads screen, each row contains a Logs button. Clicking it displays, in a separate tab, all the log entries from that particular run.

Widget Customisation

Value | Default | Comment
Widget Heading | File Import | Enter the name of the widget as it is to appear in the heading.
File Type | Test | Data Collector class that is associated with the widget. All files received by the widget are processed using this class. The contents of this list are controlled by Menu>Data Collector Framework.
Refresh Period | No widget refresh | Select a refresh period from the drop-down menu, between 15 seconds and 1 hour, or no refresh at all.

Installing New Data Collector Classes

New Data Collector classes can be created by implementing the interface 'com.c3.bigguy.interfaces.IDashFileImporter4', which allows the background process to hook into the Data Collector and process the file.

The Menu>Data Collector Framework page takes care of installing the new Data Collector into the system. It also provides feedback about the version of Data Collector classes installed.

The page also checks the jar files for duplicate classes, which is useful for detecting multiple versions of the same File Importer.
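A duplicate-class check of this kind can be approximated outside the product, since jar files are zip archives. The sketch below lists the .class entries in each jar and reports any class name that appears in more than one. It illustrates the idea only and does not describe how the Data Collector Framework page performs its check; the function name is hypothetical.

```python
import io
import zipfile
from collections import defaultdict


def find_duplicate_classes(jars):
    """Return {class name: [jar names]} for classes found in more than one jar.

    jars: mapping of jar name -> file path or binary file-like object.
    """
    owners = defaultdict(list)
    for jar_name, source in jars.items():
        with zipfile.ZipFile(source) as jar:
            for entry in jar.namelist():
                if entry.endswith(".class"):
                    # com/example/Foo.class -> com.example.Foo
                    class_name = entry[: -len(".class")].replace("/", ".")
                    owners[class_name].append(jar_name)
    return {name: names for name, names in owners.items() if len(names) > 1}
```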